12. Quiz: Goals and Rewards

So far, you've seen one example of how to frame an agent's goal as the maximization of expected cumulative reward. In this quiz, you will investigate several more examples.

Source: Wikipedia

Escape the Maze

Consider an agent who would like to learn to escape a maze. Which reward signals will encourage the agent to escape the maze as quickly as possible? Select all that apply.

SOLUTION:
  • The reward is -1 for every time step that the agent spends inside the maze. Once the agent escapes, the episode terminates.
  • The reward is -1 for every time step that the agent spends inside the maze. Once the agent escapes, it receives a reward of +10, and the episode terminates.
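To see why both reward signals favor a fast escape, it can help to compute the cumulative reward for a quick episode and a slow one. The sketch below is only illustrative: the function name `maze_return`, its parameters, and the step counts are hypothetical, and it assumes an undiscounted return with no step penalty after the agent escapes.

```python
def maze_return(steps_inside, step_reward=-1, escape_bonus=0):
    """Undiscounted cumulative reward for one episode that ends on escape.

    steps_inside: number of time steps the agent spends inside the maze.
    escape_bonus: one-time reward on escaping (0 or +10 in the quiz options).
    """
    return step_reward * steps_inside + escape_bonus


# Compare a fast escape (5 steps) with a slow one (20 steps)
# under both reward signals from the solution.
for bonus in (0, 10):
    fast = maze_return(5, escape_bonus=bonus)
    slow = maze_return(20, escape_bonus=bonus)
    print(f"bonus={bonus}: fast={fast}, slow={slow}, fast is better: {fast > slow}")
```

Under these assumptions the +10 bonus shifts every episode's return by the same amount, so the ranking of fast versus slow escapes is unchanged.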

Consider an agent who would like to learn to play a board game (like backgammon, chess, or checkers). Which reward signals will encourage the agent to win the game? Select all that apply.

SOLUTION:
  • The agent receives a reward only at the end of the game, and receives a reward of +1 if it wins, -1 if it loses, and 0 if the game is a draw.
  • The agent receives a reward only at the end of the game, and receives a reward of +10 if it wins, -10 if it loses, and 0 if the game is a draw.
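One way to check that the two end-of-game reward signals encourage the same behavior is to compare expected rewards for two playing styles. This is a minimal sketch under stated assumptions: the helper `expected_return` and the outcome probabilities for `policy_a` and `policy_b` are made up for illustration and do not come from the quiz.

```python
def expected_return(outcome_probs, rewards):
    """Expected end-of-game reward given outcome probabilities."""
    return sum(outcome_probs[o] * rewards[o] for o in outcome_probs)


# The two reward signals from the solution.
rewards_unit = {"win": 1, "loss": -1, "draw": 0}
rewards_scaled = {"win": 10, "loss": -10, "draw": 0}

# Hypothetical outcome probabilities for two policies (assumed values).
policy_a = {"win": 0.6, "loss": 0.3, "draw": 0.1}
policy_b = {"win": 0.4, "loss": 0.4, "draw": 0.2}

for name, rewards in (("unit", rewards_unit), ("scaled", rewards_scaled)):
    a = expected_return(policy_a, rewards)
    b = expected_return(policy_b, rewards)
    print(f"{name}: policy_a preferred: {a > b}")
```

Multiplying every reward by the same positive constant scales each policy's expected reward by that constant, so the preferred policy stays the same in both cases.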

Consider an agent who would like to learn to balance a plate of food on her head. Which reward signals will encourage the agent to keep the plate balanced for as long as possible? Select all that apply.

SOLUTION:
  • The reward is +1 for every time step that the agent keeps the plate balanced on her head. If the plate falls, the episode terminates.
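The link between this reward signal and the goal can be sketched by summing the rewards over an episode. The function `balance_return` and the discount parameter `gamma` below are hypothetical; the quiz itself does not mention discounting, so `gamma` defaults to 1.0 (no discounting).

```python
def balance_return(steps_balanced, gamma=1.0):
    """Cumulative reward when the agent earns +1 per balanced time step.

    The episode terminates when the plate falls, so no reward is
    collected after `steps_balanced` steps. `gamma` is an assumed
    discount factor; 1.0 means rewards are simply added up.
    """
    return sum(gamma ** t for t in range(steps_balanced))


# Keeping the plate up longer always yields a larger return.
short_episode = balance_return(3)
long_episode = balance_return(10)
print(short_episode, long_episode, long_episode > short_episode)
```

With no discounting, the return equals the number of time steps the plate stayed balanced, so maximizing it is the same as balancing for as long as possible.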